


Full Paper 



Theory of Recurrent Neural Network with Common Synaptic Inputs 

Masaki Kawamura^1*, Michiko Yamana^2 and Masato Okada^3,4,5

^1 Faculty of Science, Yamaguchi University, 1677-1 Yoshida, Yamaguchi 753-8512
^2 System Engineering Research Laboratory, Central Research Institute of Electric Power Industry, 2-11-1 Iwadokita, Komae, Tokyo 201-8511
^3 Department of Complexity Science and Engineering, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562
^4 Laboratory for Mathematical Neuroscience, RIKEN Brain Science Institute, 2-1 Hirosawa, Wako, Saitama 351-0198
^5 "Intelligent Cooperation and Control," PRESTO, JST

We discuss the effects of common synaptic inputs in a recurrent neural network. Because 
of the effects of these common synaptic inputs, the correlation between neural inputs cannot 
be ignored, and thus the network exhibits sample dependence. Networks of this type do not 
have well-defined thermodynamic limits, and self-averaging breaks down. We therefore need 
to develop a suitable theory without relying on these common properties. While the effects of 
the common synaptic inputs have been analyzed in layered neural networks, it was apparently 
difficult to analyze these effects in recurrent neural networks due to feedback connections. 
We investigated a sequential associative memory model as an example of recurrent networks 
and succeeded in deriving a macroscopic dynamical description as a recurrence relation form 
of a probability density function. 

KEYWORDS: common synaptic inputs, recurrent neural networks, probability density function, 
correlated firings, sample dependence 



1. Introduction 

Synfire chains, namely, synchronous firings of neurons, can be observed in the brain.
Diesmann et al. and Cateau and Fukai discussed conditions for propagating the synchronous
firings between layers in layered neural networks, while Amari et al. considered common
synaptic inputs to neurons in layered neural networks and discussed the correlated firings of
neurons. These studies are based on theoretical models, and the biological structure of the
synfire chains and of the common synaptic inputs remains to be elucidated. In order to understand
this structure, theoretical models must be analyzed. We therefore discuss the effects of
common synaptic inputs on an associative memory model from a theoretical viewpoint. To keep
the analysis tractable, we give our model a simple structure, unlike that of synfire
chain models.

With common synaptic inputs, the summed inputs to the neurons are correlated. The
firings of the neurons are therefore also correlated, and even with an infinite number of neurons
there is no thermodynamic limit and sample dependence appears. All solvable models,
including neural networks, that have been discussed in the statistical mechanics literature have
been analyzed by exploiting the independence of units or neurons in the thermodynamic limit.
There are few theoretical approaches, however, that address sample dependence. Yamana and
Okada introduced uniform common synaptic inputs that depend on preneurons into the layered
associative memory model and were able to derive the probability density function (PDF) for
its macroscopic states. This PDF allows the dynamics with sample dependence to be analyzed
in the layered associative memory model.

In the layered associative memory model, since synaptic connections within a layer are 
independent of each other, no correlation occurs between common synaptic inputs on different 
layers. However, in recurrent neural networks that contain feedback connections, correlations 
between common synaptic inputs at different times cannot be ignored. This makes theoretical
analysis difficult; indeed, it is hard to analyze qualitatively
the effect of common synaptic inputs in an autoassociative memory model.

In recurrent neural networks, correlated connections can also be found in asymmetric
synaptic connections, e.g.,



where $\boldsymbol{\xi}^{\mu} = (\xi_1^{\mu}, \cdots, \xi_N^{\mu})^{T}$ represents the $\mu$th memory pattern. The terms $a_j^{\mu}$ and $b_i^{\mu}$ indicate
the connections that depend on pre- and postneurons, respectively. These terms can be considered
to be noise to neurons. Since the preneuron-dependent connection leads to correlated
firings of neurons, we take particular note of the term $a_j^{\mu}$. Moreover, we reduce this term to
one that is independent of the memory patterns, $w_j$.

In this paper, we discuss a sequential associative memory model that is also a recurrent 
neural network. The associative memory model stores memory patterns in the synaptic
connections; that is, the synaptic connections are not uniform, but they do have a structure.
Moreover, the synaptic connections are time invariant, unlike those for layered networks. We 
found, however, that time correlations of states in this model can be ignored when a mem- 
ory pattern is retrieved, since the model retrieves a different pattern sequentially each time. 
The common synaptic inputs at different times can, therefore, be assumed to be indepen- 
dent. Under this consideration, we have succeeded in deriving a recurrence relation form of 
the probability density function for macroscopic states in the sequential associative memory 
model. 



(1) 
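
2. Sequential Associative Memory Model

Consider a sequential associative memory model consisting of $N$ units or neurons. The
state of the units takes $x_i^t = \pm 1$ and is updated synchronously by

$$x_i^{t+1} = F\Bigl(\sum_{j=1}^{N} J_{ij} x_j^{t}\Bigr), \tag{2}$$

where the output function is $F(h) = \mathrm{sgn}(h)$, and $J_{ij}$ is a synaptic connection from the $j$th
neuron to the $i$th neuron, given by

$$J_{ij} = \frac{1}{N}\sum_{\mu=0}^{p-1} \xi_i^{\mu+1}\xi_j^{\mu} + w_j, \tag{3}$$

where $\boldsymbol{\xi}^{p} = \boldsymbol{\xi}^{0}$. The first term on the rhs represents the coupling as in the existing sequential
associative memory model. It stores $p$ random patterns $\boldsymbol{\xi}^{\mu} = (\xi_1^{\mu}, \cdots, \xi_N^{\mu})^{T}$ so as to retrieve
the patterns as $\boldsymbol{\xi}^{0} \to \boldsymbol{\xi}^{1} \to \cdots \to \boldsymbol{\xi}^{p-1} \to \boldsymbol{\xi}^{0}$ sequentially. The second term on the rhs, $w_j$,
represents preneuron-dependent coupling. From eqs. (2) and (3), we obtain

$$x_i^{t+1} = F\Bigl(\frac{1}{N}\sum_{\mu=0}^{p-1}\sum_{j=1}^{N}\xi_i^{\mu+1}\xi_j^{\mu}x_j^{t} + \sum_{j=1}^{N} w_j x_j^{t}\Bigr). \tag{4}$$
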

Let the second term on the rhs be

$$\eta_t = \sum_{j=1}^{N} w_j x_j^{t}. \tag{5}$$

We call $\eta_t$ the common synaptic input, since it is independent of index $i$ and affects all neurons
equally. Of course, one can also regard $\eta_t$ as an external input coming from an outside system,
which might be independent of the preneurons $x_j^t$. In order to analyze the dynamics theoretically,
we assume that the coupling $w_j$ obeys the Gaussian distribution $\mathcal{N}(0, \delta^2/N)$. The
common synaptic inputs therefore act like noise in this case.
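
As a short check of why the common synaptic input acts like noise of strength $\delta$, the following sketch computes its mean and variance. It assumes that the couplings $w_j$ are independent of each other and uncorrelated with the states $x_j^t$ (the same kind of assumption the theory section makes); it is an illustration, not part of the original derivation.

```latex
\mathrm{E}[\eta_t] = \sum_{j=1}^{N} \mathrm{E}[w_j]\, x_j^{t} = 0, \qquad
\mathrm{Var}[\eta_t] = \sum_{j=1}^{N} \mathrm{E}[w_j^2]\,(x_j^{t})^2
  = N \cdot \frac{\delta^2}{N} = \delta^2 .
```

Because $(x_j^t)^2 = 1$, the $1/N$ scaling of the coupling variance keeps the common input of order one as $N$ grows, so it does not average out for large networks.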

The number of stored patterns is given by $p = \alpha N$. We call $\alpha$ the loading rate. Each component
of the memory patterns is assumed to be an independent random variable that takes a value
of either $+1$ or $-1$ according to the probability

$$\mathrm{Prob}[\xi_i^{\mu} = \pm 1] = \frac{1}{2}. \tag{6}$$

We define the overlap by the direction cosine between the state $\boldsymbol{x}^t$ and the retrieval pattern
$\boldsymbol{\xi}^t$ at time $t$,

$$m_t = \frac{1}{N}\sum_{i=1}^{N} \xi_i^{t} x_i^{t}, \tag{7}$$

and determine the initial state $\boldsymbol{x}^0$ according to the probability distribution

$$\mathrm{Prob}[x_i^{0} = \pm 1] = \frac{1 \pm m_0 \xi_i^{0}}{2}. \tag{8}$$

Therefore, the overlap between the pattern $\boldsymbol{\xi}^0$ and the initial state $\boldsymbol{x}^0$ is $m_0$. The network
state $\boldsymbol{x}^t$ at time $t$ is expected to be near the pattern $\boldsymbol{\xi}^t$ when the initial overlap $m_0$ is large
and the loading rate is below the storage capacity.
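
The model defined in this section can be simulated directly. The following is a minimal NumPy sketch, not the authors' code: the function name `simulate_overlaps`, its default parameter values, the convention sgn(0) = +1, and the way the initial state is drawn (each component equals the pattern component with probability (1 + m0)/2) are assumptions made for illustration. The local field is computed through the pattern overlaps rather than by building the N x N coupling matrix, which is algebraically the same sum.

```python
import numpy as np

def simulate_overlaps(N=2000, alpha=0.05, delta=0.1, m0=1.0, steps=10, seed=0):
    """Simulate the sequential associative memory model with common inputs.

    Returns the list of overlaps m_0, ..., m_steps with the pattern that
    should be retrieved at each time step.
    """
    rng = np.random.default_rng(seed)
    p = max(2, int(alpha * N))                      # number of patterns, p = alpha * N
    xi = rng.choice([-1.0, 1.0], size=(p, N))       # random patterns, +-1 with prob. 1/2
    w = rng.normal(0.0, delta / np.sqrt(N), size=N)  # w_j ~ N(0, delta^2 / N)

    # Initial state: flip each component of pattern 0 with probability (1 - m0) / 2,
    # so that the expected initial overlap is m0.
    flip = rng.random(N) < (1.0 - m0) / 2.0
    x = np.where(flip, -xi[0], xi[0])

    overlaps = [float(xi[0] @ x) / N]
    for t in range(steps):
        m_all = xi @ x / N                          # overlap with every pattern mu
        # sum_mu xi_i^{mu+1} * m^mu, with the cyclic convention xi^p = xi^0
        h = np.roll(xi, -1, axis=0).T @ m_all
        eta = float(w @ x)                          # common synaptic input (same for all i)
        x = np.where(h + eta >= 0.0, 1.0, -1.0)     # F = sgn, taking sgn(0) = +1
        overlaps.append(float(xi[(t + 1) % p] @ x) / N)
    return overlaps
```

Running the function several times with different `seed` values and the same `m0` gives one overlap trajectory per sample, which is the kind of experiment used to look for sample dependence.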

3. Theory 

3.1 Macroscopic state 

Let us derive the macroscopic state equations in the case of dynamics with sample dependence.
From eqs. (4)-(7), we obtain

$$x_i^{t+1} = F\bigl(\xi_i^{t+1} m_t + z_i^{t} + \eta_t\bigr), \tag{9}$$

$$z_i^{t} = \frac{1}{N}\sum_{\mu \neq t}\sum_{j=1}^{N} \xi_i^{\mu+1}\xi_j^{\mu}x_j^{t}, \tag{10}$$

where $z_i^t$ is a crosstalk noise term. We assume that the crosstalk noise obeys the Gaussian
distribution with mean $0$ and variance $\sigma_t^2$ according to the statistical neurodynamics.
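The state $x_j^{t+1}$ is expanded in terms of $\xi_j^{\mu}$ as follows:

$$z_i^{t+1} = \frac{1}{N}\sum_{\mu\neq t+1}\sum_{j=1}^{N}\xi_i^{\mu+1}\xi_j^{\mu}F\bigl(\xi_j^{t+1}m_t + z_j^{t} + \eta_t\bigr), \tag{11}$$

$$\phantom{z_i^{t+1}} = \frac{1}{N}\sum_{\mu\neq t+1}\sum_{j=1}^{N}\xi_i^{\mu+1}\xi_j^{\mu}x_j^{(\mu),t+1} + \frac{U_{t+1}}{N}\sum_{\mu\neq t+1}\sum_{k=1}^{N}\xi_i^{\mu+1}\xi_k^{\mu-1}x_k^{t}, \tag{12}$$

where

$$x_j^{(\mu),t+1} = F\Bigl(\frac{1}{N}\sum_{\nu\neq\mu-1}\sum_{k=1}^{N}\xi_j^{\nu+1}\xi_k^{\nu}x_k^{t} + \eta_t\Bigr), \tag{13}$$

$$U_{t+1} = \frac{1}{N}\sum_{j=1}^{N}F'\bigl(\xi_j^{t+1}m_t + z_j^{t} + \eta_t\bigr). \tag{14}$$

Therefore, the variance of the crosstalk noise becomes

$$\sigma_{t+1}^2 = \mathrm{E}\bigl[(z_i^{t+1})^2\bigr] = \alpha + U_{t+1}^2\sigma_t^2. \tag{15}$$
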

Let us next consider the initial state to be $\boldsymbol{x}^0 = \boldsymbol{\xi}^0$. In this case, since the pattern $\boldsymbol{\xi}^t$ is
retrieved at time $t$ and the memory patterns are independent of each other, we can assume that
$\boldsymbol{x}^t \approx \boldsymbol{\xi}^t$ and that the states at different times are independent of each other. When the correlation between
$w_j$ and $x_j^t$ can be neglected, the common synaptic inputs $\eta_t$ are independent with respect to
time $t$. Therefore, the $\eta_t$ are i.i.d. and obey the Gaussian distribution $\mathcal{N}(0, \delta^2)$. First of
all, we discuss the case in which $\eta_t$ is given. Here, $m_{t+1}$ and $\sigma_{t+1}$ can be represented as
functions of $m_t$, $\sigma_t$ and $\eta_t$:

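$$m_{t+1}(m_t,\sigma_t,\eta_t) = \int Dz\,\bigl\langle \xi^{t+1}F\bigl(\xi^{t+1}m_t + \sigma_t z + \eta_t\bigr)\bigr\rangle_{\xi}, \tag{16}$$

$$\phantom{m_{t+1}(m_t,\sigma_t,\eta_t)} = \frac{1}{2}\bigl[\mathrm{erf}(u) + \mathrm{erf}(v)\bigr], \tag{17}$$

$$U_{t+1}(m_t,\sigma_t,\eta_t) = \int Dz\,\bigl\langle F'\bigl(\xi^{t+1}m_t + \sigma_t z + \eta_t\bigr)\bigr\rangle_{\xi},$$

$$\phantom{U_{t+1}(m_t,\sigma_t,\eta_t)} = \frac{1}{\sqrt{2\pi}\,\sigma_t}\bigl[\exp(-u^2) + \exp(-v^2)\bigr], \tag{18}$$

$$\sigma_{t+1}^2(m_t,\sigma_t,\eta_t) = \alpha + U_{t+1}^2(m_t,\sigma_t,\eta_t)\,\sigma_t^2(m_{t-1},\sigma_{t-1},\eta_{t-1}), \tag{19}$$

where $Dz = \frac{dz}{\sqrt{2\pi}}\exp\bigl(-\frac{z^2}{2}\bigr)$, $u = (m_t + \eta_t)/(\sqrt{2}\sigma_t)$ and $v = (m_t - \eta_t)/(\sqrt{2}\sigma_t)$; $\langle\cdot\rangle_{\xi}$ denotes the
average over $\xi$.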
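
For a given value of the common input, the macroscopic update is a closed map from $(m_t, \sigma_t)$ to $(m_{t+1}, \sigma_{t+1})$. The sketch below implements one step of that map using only the Python standard library. The function name `macroscopic_step` is hypothetical, and the prefactor $1/(\sqrt{2\pi}\sigma_t)$ in $U_{t+1}$ is the one that follows from $F = \mathrm{sgn}$ (so that $F'$ is twice a delta function); treat it as a reading of the equations rather than the authors' implementation.

```python
import math

def macroscopic_step(m, sigma, eta, alpha):
    """Map (m_t, sigma_t) to (m_{t+1}, sigma_{t+1}) for a given common input eta_t."""
    u = (m + eta) / (math.sqrt(2.0) * sigma)
    v = (m - eta) / (math.sqrt(2.0) * sigma)
    # Overlap: average of the two pattern signs xi = +1 and xi = -1.
    m_next = 0.5 * (math.erf(u) + math.erf(v))
    # Susceptibility-like factor U_{t+1}.
    U = (math.exp(-u * u) + math.exp(-v * v)) / (math.sqrt(2.0 * math.pi) * sigma)
    # Crosstalk-noise variance recursion: sigma_{t+1}^2 = alpha + U^2 sigma_t^2.
    sigma_next = math.sqrt(alpha + (U * sigma) ** 2)
    return m_next, sigma_next
```

Iterating this function with `eta = 0.0` gives the deterministic dynamics without common synaptic inputs. A starting value such as `sigma = math.sqrt(alpha)` is a common choice for the initial crosstalk variance, but it is an assumption here because the text does not state the initial condition.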

3.2 Probability density function 

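From eqs. (17)-(19) we can evaluate the dynamics for various values of $\delta$. In the case of
$\delta = 0$, the behavior is deterministic, as in the existing sequential associative memory model.
In the case of $\delta > 0$, the behavior changes drastically: $m_t$ and $\sigma_t$ are distributed,
and these distributions are described by the probability density function $p(m_t,\sigma_t,\eta_t)$. As
described above, since $m_t$ and $\sigma_t$ are independent of $\eta_t$, the probability density function
decouples as

$$p(m_t,\sigma_t,\eta_t) = p(m_t,\sigma_t)\,p(\eta_t). \tag{20}$$

We can, therefore, obtain the PDF by

$$p(m_{t+1},\sigma_{t+1}) = \int dm_t\,d\sigma_t\,d\eta_t\;p(m_t,\sigma_t)\,p(\eta_t)\,\delta\bigl(m_{t+1} - m_{t+1}(m_t,\sigma_t,\eta_t)\bigr)\,\delta\bigl(\sigma_{t+1} - \sigma_{t+1}(m_t,\sigma_t,\eta_t)\bigr), \tag{21}$$

where $\delta(\cdot)$ denotes Dirac's delta function. The PDF $p(\eta_t)$ is given by

$$p(\eta_t) = \frac{1}{\sqrt{2\pi}\,\delta}\exp\Bigl(-\frac{\eta_t^2}{2\delta^2}\Bigr). \tag{22}$$

We combine the terms involving $\eta_t$ into a kernel function $K(m_{t+1},\sigma_{t+1};m_t,\sigma_t)$:

$$p(m_{t+1},\sigma_{t+1}) = \int dm_t\,d\sigma_t\;p(m_t,\sigma_t)\,K(m_{t+1},\sigma_{t+1};m_t,\sigma_t), \tag{23}$$

$$K(m_{t+1},\sigma_{t+1};m_t,\sigma_t) = \int d\eta_t\;p(\eta_t)\,\delta\bigl(m_{t+1} - m_{t+1}(m_t,\sigma_t,\eta_t)\bigr)\,\delta\bigl(\sigma_{t+1} - \sigma_{t+1}(m_t,\sigma_t,\eta_t)\bigr). \tag{24}$$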

The kernel function $K(m_{t+1},\sigma_{t+1};m_t,\sigma_t)$ can be evaluated analytically. Let $m_t^{*},\sigma_t^{*}$ be the
values of $m_t,\sigma_t$ satisfying eqs. (17)-(19) for given $m_{t+1}$, $\sigma_{t+1}$ and $\eta_t$. Then, $K(m_{t+1},\sigma_{t+1};m_t,\sigma_t)$ becomes

$$K(m_{t+1},\sigma_{t+1};m_t,\sigma_t) = \int d\eta_t\;p(\eta_t)\,\frac{\pi\,\sigma_t^{*2}\sqrt{2\pi\alpha + \bigl(e^{-u^2} + e^{-v^2}\bigr)^2}}{(u - v)^2\bigl(e^{-u^2} + e^{-v^2}\bigr)e^{-u^2 - v^2}}\,\delta(m_t - m_t^{*})\,\delta(\sigma_t - \sigma_t^{*}), \tag{25}$$

where $u = (m_t^{*} + \eta_t)/(\sqrt{2}\sigma_t^{*})$ and $v = (m_t^{*} - \eta_t)/(\sqrt{2}\sigma_t^{*})$. Our PDF agrees with the PDF for the
layered associative memory model obtained by Yamana and Okada.

Fig. 1. Time evolutions of overlap without common synaptic inputs ($\delta = 0$) for $m_0 = 0.30$ and 0.45, where $\alpha = 0.20$ and $N = 5,000$.
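
The recurrence relation for the PDF can also be explored numerically without evaluating the kernel. The sketch below swaps in plain Monte Carlo sampling: it draws an independent Gaussian common input for every sample at every step and pushes each sample through the macroscopic map, so the resulting samples are distributed according to the PDF at the final time. This is not the authors' analytical kernel evaluation. The function name `sample_macroscopic_states`, the point-mass initial distribution at $(m_0, \sqrt{\alpha})$, and the default sample count are assumptions for illustration.

```python
import math
import numpy as np

def sample_macroscopic_states(m0, alpha, delta, steps, n_samples=1000, seed=0):
    """Draw samples of (m_t, sigma_t) after `steps` updates by Monte Carlo."""
    rng = np.random.default_rng(seed)
    erf = np.vectorize(math.erf, otypes=[float])    # elementwise error function
    m = np.full(n_samples, float(m0))
    sigma = np.full(n_samples, math.sqrt(alpha))    # assumed initial sigma_0^2 = alpha
    for _ in range(steps):
        eta = rng.normal(0.0, delta, size=n_samples)  # eta_t ~ N(0, delta^2), i.i.d. in time
        u = (m + eta) / (math.sqrt(2.0) * sigma)
        v = (m - eta) / (math.sqrt(2.0) * sigma)
        m_next = 0.5 * (erf(u) + erf(v))
        s = np.exp(-u * u) + np.exp(-v * v)
        # U^2 sigma^2 = s^2 / (2 pi), so sigma_{t+1}^2 = alpha + s^2 / (2 pi)
        sigma = np.sqrt(alpha + s * s / (2.0 * math.pi))
        m = m_next
    return m, sigma
```

A normalized histogram of the returned overlaps, for example `np.histogram(m, bins=40, range=(-1, 1), density=True)`, approximates the distribution of $m_t$ with $\sigma_t$ integrated out. With `delta = 0.0` all samples coincide, which reflects the deterministic case.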

4. Effect of Common Synaptic Inputs 

We demonstrate the effect of the common synaptic inputs in our model with computer
simulations. The effect of the inputs depends on the variance $\delta^2$. In the case of $\delta = 0$, the
dynamical behaviors are uniquely determined by the initial states. On the other
hand, in the case of $\delta > 0$, since the inputs to the neurons are correlated,
sample dependence arises. That is, the dynamical behaviors are not determined by
the initial states alone, and the model either succeeds or fails to retrieve the memory pattern from
the same initial state.

First, we show the time evolutions of overlap when there is no common synaptic input
($\delta = 0$). Figure 1 shows 30 samples of overlap $m_t$ for initial overlaps $m_0 = 0.30$ and 0.45,
where the loading rate is $\alpha = 0.20$ and the number of neurons is $N = 5,000$. While there
are fluctuations in the overlaps, they are caused by the finite number of neurons; the larger the
number of neurons is, the smaller the fluctuations are. Therefore, no sample dependence can
be seen in the case of $\delta = 0$.

Next, we show the time evolutions of overlap for various $\delta$ values in order to find the effect
of the common synaptic inputs. Figure 2 shows the overlaps for $\delta = 0.1$, 0.2 and 0.3. We use
an initial overlap of $m_0 = 1$ in order to discuss the stability of the memory state. For small $\delta$
values, as in Figs. 2(a) and 2(b), the stored patterns can be stably retrieved. For $\delta = 0.3$, as
in Fig. 2(c), however, the network gradually moves away from the memory state in many of
the samples.

Furthermore, we verify that a memory state is an attractor when the memory state is
stable. Figure 3 shows the time evolutions of overlap for $\delta = 0.20$, where $m_0 = 0.30$ and 0.45,
$\alpha = 0.20$ and $N = 5,000$. The figure shows 30 samples of different trials. Whereas the network
reaches the nonretrieval state for all samples in the case of $m_0 = 0.30$, as in Fig. 3(a), it reaches
either the retrieval or the nonretrieval state depending on the sample in the case of $m_0 = 0.45$, as
in Fig. 3(b). From these results, the memory state is a stationary state of the network, and
it has a finite basin of attraction. From the same initial state the network can reach different
attractors because of the common synaptic inputs, which means that sample dependence exists and
that self-averaging breaks down.

Fig. 2. Time evolutions of overlap with common synaptic inputs for (a) $\delta = 0.1$, (b) $\delta = 0.2$, and (c) $\delta = 0.3$, where $\alpha = 0.20$ and $m_0 = 1.0$.

Fig. 3. Time evolutions of overlap for (a) $m_0 = 0.30$ and (b) $m_0 = 0.45$, where $\alpha = 0.20$ and $\delta = 0.20$.

5. Probability Distribution 

From Figs. 2 and 3 we can see the sample dependence caused by the common synaptic inputs.
We therefore need to discuss the distribution of macroscopic states instead of the behavior
for each trial in order to analyze the behavior of the network; that is, we must discuss the
probability distribution of overlaps. From eq. (23), the probability distribution at any time
$t$ can be evaluated when the initial distribution $p(m_0,\sigma_0)$ is given. Here, we introduce the
marginal probability distribution $p(m_t)$, obtained by integrating over $\sigma_t$:

$$p(m_t) = \int d\sigma_t\;p(m_t,\sigma_t). \tag{26}$$

We analyze the probability distribution of overlaps at times $t = 5$, 30 and 90 for the case of Fig. 3(b).
Figure 4 shows the marginal probability distribution obtained by our theory and histograms
obtained from the computer simulations. The lines denote the results obtained from eq. (26),
and the boxes denote the histograms for 1,000 samples obtained from the computer simulations
($N = 5,000$). In the cases of Figs. 4(a) and 4(b), the results obtained by the theory agree
with those obtained by the computer simulations. On the other hand, in the case of Fig. 4(c),
both results agree at $m_t \approx 1$, but the distribution from the computer simulations spreads at
$m_t \approx 0$. In the nonretrieval state, the assumption of $\boldsymbol{x}^t \approx \boldsymbol{\xi}^t$ may not be satisfied, in which
case the time correlation may not be negligible. We have, however, verified with the computer
simulations that $\boldsymbol{x}^t$ has no time correlation in the nonretrieval state. Figure 5 shows the time
evolutions of overlap $m_t$ and the time correlation coefficient for an initial overlap $m_0 = 0.10$,
where $\alpha = 0.20$, $\delta = 0.20$ and $N = 20,000$. The error bars represent the average
and standard deviation of the time correlation coefficients, and the line represents the average of the overlaps, both over
20 trials. The network state goes to the nonretrieval state, since the overlap $m_t$
becomes zero. The time correlation coefficients are calculated using the states $\boldsymbol{x}^t$ and $\boldsymbol{x}^{t+1}$.
Since they are almost zero, $\boldsymbol{x}^t$ has no time correlation.

Another possible source of disagreement in the nonretrieval case is the Gaussian assumption
for the crosstalk noise. Although in the sequential associative memory model without
the common synaptic inputs the crosstalk noise obeys the Gaussian distribution even in the
nonretrieval case, it is difficult to show that it is Gaussian in the model with the common
synaptic inputs. Therefore, we suppose that the fluctuation at $m_t \approx 0$ is caused by either the
breakdown of the Gaussian assumption or the finite number of neurons,
as is the case with $\delta = 0$ (Fig. 1).
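
The time-correlation check described in this section can be sketched as follows. The text does not give the formula for the time correlation coefficient, so the code assumes the ordinary Pearson correlation coefficient, taken over neurons, between the state vectors at two consecutive time steps; the function name `time_correlation` is hypothetical.

```python
import numpy as np

def time_correlation(x_t, x_next):
    """Pearson correlation over neurons between two state vectors of +-1 values."""
    a = np.asarray(x_t, dtype=float)
    b = np.asarray(x_next, dtype=float)
    a = a - a.mean()                     # remove the mean activity of each state
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0.0:                     # a constant state has no defined correlation
        return 0.0
    return float((a * b).sum() / denom)
```

For two independent random states of N neurons the coefficient fluctuates around zero with a spread of roughly 1/sqrt(N), so values of that size are consistent with the absence of time correlation.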

6. Summary 

Correlated firing such as that by synfire chains is a noticeable phenomenon. The mech- 
anism will be elucidated by theoretical models in the future. We discussed the effects of the 
common synaptic inputs in a sequential associative memory model. In this model, correlated 
firing occurs because the input to each neuron has a correlation due to the common synaptic 
inputs; therefore, sample dependence exists. We verified the existence of sample dependence 
via computer simulations. In order to investigate the correlated firing, we need to analyze 
theoretically novel phenomena caused by the sample dependence. However, we were unable to 
use the independence of units or neurons at the thermodynamic limit. Moreover, in recurrent 
neural networks, theoretical treatment is much more difficult because of feedback connections. 
We therefore considered the sequential associative memory model, in which time correlation 
can be ignored, allowing us to derive a recurrence relation form of the PDF at the macroscopic 
state. The probability distributions obtained by our theory agree with those obtained by the 
computer simulations. 

We analyzed the sequential associative memory model that had common synaptic inputs. 
However, it may be hard to rigorously analyze models such as autoassociative memory models 
since the time correlation cannot be neglected. 

Fig. 4. Marginal probability distribution at (a) $t = 5$, (b) $t = 30$, and (c) $t = 90$, where $\alpha = 0.20$, $m_0 = 0.45$ and $\delta = 0.20$.

Fig. 5. Time evolutions of overlap and time correlation coefficient, where $\alpha = 0.20$, $m_0 = 0.10$ and $\delta = 0.20$.